智能论文笔记

Sublinear Algorithms for Hierarchical Clustering

Arpit Agarwal , Sanjeev Khanna , Huan Li , Prathamesh Patil

分类：机器学习

2022-06-15

图形上的分层聚类是数据挖掘和机器学习中的一项基本任务，并在系统发育学，社交网络分析和信息检索等领域中进行了应用。具体而言，我们考虑了由于Dasgupta引起的层次聚类的最近普及的目标函数。以前（大约）最小化此目标函数的算法需要线性时间/空间复杂性。在许多应用程序中，底层图的大小可能很大，即使使用线性时间/空间算法，也可以在计算上具有挑战性。结果，人们对设计只能使用sublinear资源执行全局计算的算法有浓厚的兴趣。这项工作的重点是在三个经过良好的sublinear计算模型下研究大量图的层次聚类，分别侧重于时空，时间和通信，作为要优化的主要资源：（1）（动态）流模型。边缘作为流，（2）查询模型表示，其中使用邻居和度查询查询图形，（3）MPC模型，其中图边缘通过通信通道连接的几台机器进行了分区。我们在上面的所有三个模型中设计用于层次聚类的sublinear算法。我们算法结果的核心是图表中的剪切方面的视图，这使我们能够使用宽松的剪刀示意图进行分层聚类，同时仅引入目标函数中的较小失真。然后，我们的主要算法贡献是如何在查询模型和MPC模型中有效地构建所需形式的切割稀疏器。我们通过建立几乎匹配的下限来补充我们的算法结果，该界限排除了在每个模型中设计更好的算法的可能性。

translated by 谷歌翻译

Online Handbook of Argumentation for AI: Volume 3

Lars Bengel , Elfia Bezou-Vrakatseli , Lydia Blümel , Federico Castagna , Giulia D'Agostino , Daphne Odekerken , Minal Suresh Patil , Jordan Robinson , Hao Wu , Andreas Xydis

分类：人工智能

2022-12-15

This volume contains revised versions of the papers selected for the third volume of the Online Handbook of Argumentation for AI (OHAAI). Previously, formal theories of argument and argument interaction have been proposed and studied, and this has led to the more recent study of computational models of argument. Argumentation, as a field within artificial intelligence (AI), is highly relevant for researchers interested in symbolic representations of knowledge and defeasible reasoning. The purpose of this handbook is to provide an open access and curated anthology for the argumentation research community. OHAAI is designed to serve as a research hub to keep track of the latest and upcoming PhD-driven research on the theory and application of argumentation in all areas related to AI.

translated by 谷歌翻译

Auto-labelling of Bug Report using Natural Language Processing

Avinash Patil , Aryan Jadon

分类：人工智能 | 机器学习

2022-12-13

The exercise of detecting similar bug reports in bug tracking systems is known as duplicate bug report detection. Having prior knowledge of a bug report's existence reduces efforts put into debugging problems and identifying the root cause. Rule and Query-based solutions recommend a long list of potential similar bug reports with no clear ranking. In addition, triage engineers are less motivated to spend time going through an extensive list. Consequently, this deters the use of duplicate bug report retrieval solutions. In this paper, we have proposed a solution using a combination of NLP techniques. Our approach considers unstructured and structured attributes of a bug report like summary, description and severity, impacted products, platforms, categories, etc. It uses a custom data transformer, a deep neural network, and a non-generalizing machine learning method to retrieve existing identical bug reports. We have performed numerous experiments with significant data sources containing thousands of bug reports and showcased that the proposed solution achieves a high retrieval accuracy of 70% for recall@5.

translated by 谷歌翻译

Low-Resource End-to-end Sanskrit TTS using Tacotron2, WaveGlow and Transfer Learning

Ankur Debnath , Shridevi S Patil , Gangotri Nadiger , Ramakrishnan Angarai Ganesan

分类：自然语言处理 | (统计)机器学习

2022-12-07

End-to-end text-to-speech (TTS) systems have been developed for European languages like English and Spanish with state-of-the-art speech quality, prosody, and naturalness. However, development of end-to-end TTS for Indian languages is lagging behind in terms of quality. The challenges involved in such a task are: 1) scarcity of quality training data; 2) low efficiency during training and inference; 3) slow convergence in the case of large vocabulary size. In our work reported in this paper, we have investigated the use of fine-tuning the English-pretrained Tacotron2 model with limited Sanskrit data to synthesize natural sounding speech in Sanskrit in low resource settings. Our experiments show encouraging results, achieving an overall MOS of 3.38 from 37 evaluators with good Sanskrit spoken knowledge. This is really a very good result, considering the fact that the speech data we have used is of duration 2.5 hours only.

translated by 谷歌翻译

Towards Preserving Semantic Structure in Argumentative Multi-Agent via Abstract Interpretation

Minal Suresh Patil

分类：人工智能

2022-11-28

Over the recent twenty years, argumentation has received considerable attention in the fields of knowledge representation, reasoning, and multi-agent systems. However, argumentation in dynamic multi-agent systems encounters the problem of significant arguments generated by agents, which comes at the expense of representational complexity and computational cost. In this work, we aim to investigate the notion of abstraction from the model-checking perspective, where several arguments are trying to defend the same position from various points of view, thereby reducing the size of the argumentation framework whilst preserving the semantic flow structure in the system.

translated by 谷歌翻译

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao , Angela Fan , Christopher Akiki , Ellie Pavlick , Suzana Ilić , Daniel Hesslow , Roman Castagné , Alexandra Sasha Luccioni , François Yvon , Matthias Gallé

分类：自然语言处理

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

translated by 谷歌翻译

CoNMix for Source-free Single and Multi-target Domain Adaptation

Vikash Kumar , Rohit Lal , Himanshu Patil , Anirban Chakraborty

分类：机器学习 | 人工智能 | 计算机视觉

2022-11-07

This work introduces the novel task of Source-free Multi-target Domain Adaptation and proposes adaptation framework comprising of \textbf{Co}nsistency with \textbf{N}uclear-Norm Maximization and \textbf{Mix}Up knowledge distillation (\textit{CoNMix}) as a solution to this problem. The main motive of this work is to solve for Single and Multi target Domain Adaptation (SMTDA) for the source-free paradigm, which enforces a constraint where the labeled source data is not available during target adaptation due to various privacy-related restrictions on data sharing. The source-free approach leverages target pseudo labels, which can be noisy, to improve the target adaptation. We introduce consistency between label preserving augmentations and utilize pseudo label refinement methods to reduce noisy pseudo labels. Further, we propose novel MixUp Knowledge Distillation (MKD) for better generalization on multiple target domains using various source-free STDA models. We also show that the Vision Transformer (VT) backbone gives better feature representation with improved domain transferability and class discriminability. Our proposed framework achieves the state-of-the-art (SOTA) results in various paradigms of source-free STDA and MTDA settings on popular domain adaptation datasets like Office-Home, Office-Caltech, and DomainNet. Project Page: https://sites.google.com/view/conmix-vcl

translated by 谷歌翻译

Named Entity Recognition in Indian court judgments

Prathamesh Kalamkar , Astha Agarwal , Aman Tiwari , Smita Gupta , Saurabh Karn , Vivek Raghavan

分类：自然语言处理 | 人工智能

2022-11-07

Identification of named entities from legal texts is an essential building block for developing other legal Artificial Intelligence applications. Named Entities in legal texts are slightly different and more fine-grained than commonly used named entities like Person, Organization, Location etc. In this paper, we introduce a new corpus of 46545 annotated legal named entities mapped to 14 legal entity types. The Baseline model for extracting legal named entities from judgment text is also developed.

translated by 谷歌翻译

On learning history based policies for controlling Markov decision processes

Gandharv Patil , Aditya Mahajan , Doina Precup

分类：机器学习 | (统计)机器学习

2022-11-06

Reinforcementlearning(RL)folkloresuggeststhathistory-basedfunctionapproximationmethods,suchas recurrent neural nets or history-based state abstraction, perform better than their memory-less counterparts, due to the fact that function approximation in Markov decision processes (MDP) can be viewed as inducing a Partially observable MDP. However, there has been little formal analysis of such history-based algorithms, as most existing frameworks focus exclusively on memory-less features. In this paper, we introduce a theoretical framework for studying the behaviour of RL algorithms that learn to control an MDP using history-based feature abstraction mappings. Furthermore, we use this framework to design a practical RL algorithm and we numerically evaluate its effectiveness on a set of continuous control tasks.

translated by 谷歌翻译

A Comprehensive Survey of Regression Based Loss Functions for Time Series Forecasting

Aryan Jadon , Avinash Patil , Shruti Jadon

分类：机器学习 | 人工智能

2022-11-05

Time Series Forecasting has been an active area of research due to its many applications ranging from network usage prediction, resource allocation, anomaly detection, and predictive maintenance. Numerous publications published in the last five years have proposed diverse sets of objective loss functions to address cases such as biased data, long-term forecasting, multicollinear features, etc. In this paper, we have summarized 14 well-known regression loss functions commonly used for time series forecasting and listed out the circumstances where their application can aid in faster and better model convergence. We have also demonstrated how certain categories of loss functions perform well across all data sets and can be considered as a baseline objective function in circumstances where the distribution of the data is unknown. Our code is available at GitHub: https://github.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow.

translated by 谷歌翻译